Impact of Chip-Level Integration on Performance of OLTP Workloads

Authors

Luiz André Barroso

Kourosh Gharachorloo

Andreas Nowatzyk

Ben Verghese

Abstract

With increasing chip densities, future microprocessor designs have the opportunity to integrate many of the traditional system-level modules onto the same chip as the processor. Some current designs already integrate extremely large on-chip caches, and there are aggressive next-generation designs that attempt to also integrate the memory controller, coherence hardware, and network router all onto a single chip. The tight coupling of these modules will enable efficient memory systems with substantially better latency and bandwidth characteristics relative to current designs. Among the important application areas for high-performance servers, online transaction processing (OLTP) workloads are likely to benefit most from these trends due to their large instruction and data footprints and high communication miss rates. This paper examines the design trade-offs that arise as more system functionality is integrated onto the processor chip, and identifies a number of important architectural choices that are influenced by chip-level integration. In addition, the paper presents a detailed study of the performance impact of chip-level integration in the context of OLTP workloads. Our results are based on full system simulations of the Oracle commercial database engine running on both in-order and out-of-order issue processors used in uniprocessor and multiprocessor configurations. The results show that chip-level integration can improve the performance of both configurations by about 1.4 to 1.5 times, though for different reasons. For uniprocessors, integration of the L2 cache and the resulting lower hit latency is the primary factor in performance improvement. For multiprocessors, the improvement comes from both the integration of the L2 cache (lower L2 hit latency) and the integration of the other memory system components (better dirty remote latency). Furthermore, we find that the higher associativity afforded by integrating the L2 cache plays a critical role in counteracting the loss of capacity relative to larger off-chip caches. Finally, we find that the relative gains from chip-level integration are virtually identical for in-order and out-of-order processors.

Similar resources

A Detailed Comparison of Two Transaction Processing Workloads

Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a challenging set of requirements for system designs since they often exhibit inefficient executions dominated by a large memory stall component. A number of recent studies have char...

Analyzing the Impact of System Architecture on the Scalability of OLTP Engines for High-Contention Workloads

Main-memory OLTP engines are being increasingly deployed on multicore servers that provide abundant thread-level parallelism. However, recent research has shown that even the state-of-the-art OLTP engines are unable to exploit available parallelism for high contention workloads. While previous studies have shown the lack of scalability of all popular concurrency control protocols, they consider...

Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...

ThunderGeckoMonkey: An Energy-Aware High Performance Secure Computing System

This paper presents ThunderGeckoMonkey, a high-performance, energy-efficient processor aimed at On-Line Transaction Processing (OLTP) workloads. ThunderGeckoMonkey also offers enhanced, transparent fault tolerance and hardware support for a wide range of common encryption and hashing algorithms. A chip multiprocessor, ThunderGeckoMonkey is based on 4-issue out-of-order cores, each utilizing adva...

The impact of interwoven integration practices on supply chain value addition and firm performance

Drawing on the supply chain (SC) management literature, this article conceptualizes and empirically tests a framework that shows how both external and internal integration practices are significant and positively associated with SC value addition and firm performance. The framework also tests the impact of value addition as a reinforcing factor on firm performance. The outcome of this investiga...

Publication date: 2000
